Probabilistic K-mean with local alignment for clustering and motif discovery in functional data
We develop a new method to locally cluster curves and discover functional
motifs, i.e. typical "shapes" that may recur several times along and across
the curves capturing important local characteristics. In order to identify
these shared curve portions, our method leverages ideas from functional data
analysis (joint clustering and alignment of curves), bioinformatics (local
alignment through the extension of high similarity seeds) and fuzzy clustering
(curves belonging to more than one cluster, if they contain more than one
typical "shape"). It can employ various dissimilarity measures and
incorporate derivatives in the discovery process, thus exploiting complex
facets of shapes. We demonstrate the performance of our method with an
extensive simulation study, and show how it generalizes other clustering
methods for functional data. Finally, we provide real data applications to
Berkeley growth data, Italian Covid-19 death curves and "Omics" data related
to mutagenesis.
Comment: 22 pages, 6 figures. This work has been presented at various
conference
The shapes of an epidemic: using Functional Data Analysis to characterize COVID-19 in Italy
We investigate patterns of COVID-19 mortality across 20 Italian regions and
their association with mobility, positivity, and socio-demographic,
infrastructural and environmental covariates. Notwithstanding limitations in
accuracy and resolution of the data available from public sources, we pinpoint
significant trends exploiting information in curves and shapes with Functional
Data Analysis techniques. These depict two starkly different epidemics: an
"exponential" one unfolding in Lombardia and the worst hit areas of the north,
and a milder, "flat(tened)" one in the rest of the country -- including Veneto,
where cases appeared concurrently with Lombardia but aggressive testing was
implemented early on. We find that mobility and positivity can predict COVID-19
mortality, also when controlling for relevant covariates. Among the latter,
primary care appears to mitigate mortality, and contacts in hospitals, schools
and workplaces to aggravate it. The techniques we describe could capture
additional and potentially sharper signals if applied to richer data.
Peak shape clustering reveals biological insights
Background: ChIP-seq experiments are widely used to detect and study DNA-protein interactions, such as transcription factor binding and chromatin modifications. However, downstream analysis of ChIP-seq data is currently restricted to the evaluation of signal intensity and the detection of enriched regions (peaks) in the genome. Other features of peak shape are almost always neglected, despite the remarkable differences shown by ChIP-seq for different proteins, as well as by distinct regions in a single experiment. Results: We hypothesize that statistically significant differences in peak shape might have a functional role and a biological meaning. Thus, we design five indices able to summarize peak shapes and we employ multivariate clustering techniques to divide peaks into groups according to both their complexity and the intensity of their coverage function. In addition, our novel analysis pipeline employs a range of statistical and bioinformatics techniques to relate the obtained peak shapes to several independent genomic datasets, including other genome-wide protein-DNA maps and gene expression experiments. To clarify the meaning of peak shape, we apply our methodology to the study of the erythroid transcription factor GATA-1 in the K562 cell line and in megakaryocytes. Conclusions: Our study demonstrates that ChIP-seq profiles include information regarding the binding of other proteins besides the one used for precipitation. In particular, peak shape provides new insights into cooperative transcriptional regulation and is correlated with gene expression.
Predicting Railway Wheel Wear under Uncertainty of Wear Coefficient, using Universal Kriging
Railway wheel wear prediction is essential for reliability and optimal maintenance strategies of railway systems. Indeed, an accurate wear prediction can have both economic and safety implications. In this paper we propose a novel methodology, based on Archard's equation and a local contact model, to forecast the volume of material worn and the corresponding wheel remaining useful life (RUL). A universal kriging estimate of the wear coefficient is embedded in our method. Exploiting the dependence among wear coefficient measurements taken at similar contact pressure and sliding speed, we construct a continuous wear coefficient map that proves to be more informative than the ones currently available in the literature. Moreover, this approach leads to an uncertainty analysis on the wear coefficient. As a consequence, we are able to construct wear prediction intervals that provide reasonable guidelines in practice.
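The abstract names Archard's equation without writing it out. For orientation, its commonly cited form is given below. The symbols are not taken from the source, and the second line is only an assumed reading of how a kriging estimate of the wear coefficient might enter the formula; the paper's local contact model is not reproduced here.

```latex
% Archard's wear law in its commonly cited form (symbols are not from the source):
%   V   : volume of material worn
%   k   : dimensionless wear coefficient
%   F_N : normal contact force
%   s   : sliding distance
%   H   : hardness of the softer material
V = k \, \frac{F_N \, s}{H}
% Assumed reading of the abstract: k is replaced by a kriging estimate
% \hat{k}(p, v) evaluated at contact pressure p and sliding speed v.
V \approx \hat{k}(p, v) \, \frac{F_N \, s}{H}
```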
COVID-19 effects on the Canadian term structure of interest rates
In Canada, the COVID-19 pandemic triggered exceptional monetary policy interventions by the central bank, which in March 2020 made multiple unscheduled cuts to its target rate. In this paper we assess the extent to which Bank of Canada interventions affected the determinants of the yield curve. In particular, we apply Functional Principal Component Analysis to the term structure of interest rates. We find that, during the pandemic, the long-run dependence of the level and slope components of the yield curve is unchanged with respect to previous months, although the shape of the mean yield curve completely changed after the target rate cuts. The Bank of Canada was effective in lowering the whole yield curve and correcting the inverted hump of previous months, but it was not able to reduce the exposure to already existing long-run risks.
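As a rough illustration of what applying Functional Principal Component Analysis to yield curves can look like, here is a minimal sketch of a discretized FPCA computed through an SVD. It is not the paper's procedure: the function name `fpca`, the maturity grid, and the synthetic "level" and "slope" factors are all assumptions made for the example, and no real Bank of Canada data is used. On an equally spaced grid, the plain SVD used here matches the functional version up to a constant scaling.

```python
import numpy as np

def fpca(curves, n_components=2):
    """Discretized FPCA (hypothetical helper): rows are curves on a common grid."""
    curves = np.asarray(curves, dtype=float)
    mean_curve = curves.mean(axis=0)
    centered = curves - mean_curve
    # SVD of the centered data matrix; rows of vt are the principal component curves.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    variances = singular_values**2 / (curves.shape[0] - 1)
    explained = variances / variances.sum()
    components = vt[:n_components]
    # Scores are the projections of each centered curve onto the components.
    scores = centered @ components.T
    return mean_curve, components, scores, explained[:n_components]

# Synthetic yield curves (assumption): a fixed base shape plus random
# day-to-day "level" and "slope" movements and a little noise.
rng = np.random.default_rng(0)
maturities = np.linspace(0.25, 30.0, 40)           # years, equally spaced grid
n_days = 200
level = rng.normal(0.0, 1.0, size=(n_days, 1))
slope = rng.normal(0.0, 0.5, size=(n_days, 1))
base = 2.0 + 0.5 * np.log1p(maturities)
slope_shape = (maturities - maturities.mean()) / maturities.max()
noise = rng.normal(0.0, 0.01, size=(n_days, maturities.size))
curves = base + level + slope * slope_shape + noise

mean_curve, components, scores, explained = fpca(curves, n_components=2)
print("share of variance explained by the first two components:", explained)
```

Because the synthetic curves are driven by only two factors, the first two components should absorb nearly all of the variance; with real yield data the split between components would have to be checked empirically.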
Multiple FLR models for fixed vs. polymorphic ETn.
See explanations for Table 3 (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004956#pcbi.1004956.t003).